x86 virtualization
In computing, x86 virtualization is the facility that allows multiple operating systems to simultaneously share x86 processor resources in a safe and efficient manner, a facility generically known as hardware virtualization. In the late 1990s x86 virtualization was achieved by complex software techniques which overcame the processor's lack of virtualization support and attained reasonable performance. In the mid 2000s, both Intel and AMD added hardware support to their processors making virtualization software simpler, and later hardware changes provided substantial speed improvements.
Software-based virtualization
The following discussion focuses only on virtualization of protected mode of the x86 architecture.
In protected mode, the operating system kernel runs at a higher privilege level such as ring 0, and applications run at a lower privilege level such as ring 3. Similarly, under virtualization the host OS must control the processor while guest OSes are prevented from directly accessing the hardware. One approach used in x86 software-based virtualization is called ring deprivileging, which involves running the guest OS at a ring higher (that is, less privileged) than 0.[1]
Three techniques made virtualization of protected mode possible:
- Binary translation rewrites certain ring 0 instructions, such as POPF, in terms of ring 3 instructions. When executed above ring 0, these instructions would otherwise fail silently or behave differently,[2][3]:3 which makes classic trap-and-emulate virtualization impossible.[3]:1[4] To improve performance, the translated basic blocks need to be cached in a coherent way that detects code patching (used in VxDs, for instance), the reuse of pages by the guest OS, or even self-modifying code.[5]
- A number of key data structures used by a processor need to be shadowed. Because most operating systems use paged virtual memory, and granting the guest OS direct access to the MMU would mean loss of control by the virtualization manager, some of the work of the x86 MMU needs to be duplicated in software for the guest OS using a technique known as shadow page tables.[6]:5[3]:2 This involves denying the guest OS any access to the actual page table entries by trapping access attempts and emulating them instead in software. The x86 architecture uses hidden state to store segment descriptors in the processor, so once the segment descriptors have been loaded into the processor, the memory from which they have been loaded may be overwritten and there is no way to get the descriptors back from the processor. Shadow descriptor tables must therefore be used to track changes made to the descriptor tables by the guest OS.[4]
- I/O device emulation: Unsupported devices on the guest OS must be emulated by a device emulator that runs in the host OS.[7]
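The binary translation technique in the list above can be illustrated with a toy model. The following is a minimal sketch, not how any particular product works: instructions are plain strings, the set of sensitive instructions is an illustrative subset, and the cache detects modified guest code by comparing a snapshot of the source block (real translators typically rely on mechanisms such as write-protecting translated pages). The names `translate_block` and `TranslationCache` are hypothetical.

```python
# Toy model of a binary translator's block cache. Instruction names are
# plain strings; nothing here decodes real x86 machine code.

# Illustrative subset of instructions that are sensitive but do not trap
# when run outside ring 0.
SENSITIVE = {"POPF", "PUSHF", "SGDT"}


def translate_block(block):
    """Replace each sensitive instruction with a call into an emulation routine."""
    return tuple(
        f"CALL emulate_{op.lower()}" if op in SENSITIVE else op
        for op in block
    )


class TranslationCache:
    """Caches translated basic blocks, keyed by guest address."""

    def __init__(self):
        self._entries = {}  # address -> (source snapshot, translated block)
        self.misses = 0

    def lookup(self, guest_memory, address):
        source = tuple(guest_memory[address])
        entry = self._entries.get(address)
        if entry is not None and entry[0] == source:
            return entry[1]  # guest code unchanged: reuse the translation
        # First visit, or the guest patched or reused this code: retranslate.
        self.misses += 1
        translated = translate_block(source)
        self._entries[address] = (source, translated)
        return translated
```

Calling `lookup` twice on an unchanged block translates it once; after the guest overwrites the block, the next `lookup` notices the mismatch and retranslates, which is the coherence property the cache needs.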
These techniques incur some performance overhead due to lack of MMU virtualization support, as compared to a VM running on a natively virtualizable architecture such as the IBM System/370.[3]:10[8]:17 and 21
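Much of that overhead comes from keeping shadow page tables in sync. As a rough sketch under simplifying assumptions (single-level page tables modelled as dictionaries, 4 KiB pages, and hypothetical names such as `ShadowMMU` and `guest_write_pte`), the bookkeeping looks like this:

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class ShadowMMU:
    """Toy shadow paging: the guest sees its own table, hardware uses the shadow."""

    def __init__(self, guest_phys_to_host_phys):
        # Mapping owned by the virtualization manager:
        # guest-physical page number -> host-physical page number.
        self.p2m = dict(guest_phys_to_host_phys)
        self.guest_pt = {}   # guest view: virtual page -> guest-physical page
        self.shadow_pt = {}  # real view:  virtual page -> host-physical page

    def guest_write_pte(self, vpn, guest_pfn):
        """Runs when a guest write to its page table traps into the VMM."""
        if guest_pfn not in self.p2m:
            raise PermissionError("guest-physical page not assigned to this guest")
        self.guest_pt[vpn] = guest_pfn               # emulate the guest's write
        self.shadow_pt[vpn] = self.p2m[guest_pfn]    # keep the shadow in sync

    def guest_read_pte(self, vpn):
        """The guest only ever observes its own (guest-physical) entries."""
        return self.guest_pt[vpn]

    def translate(self, virtual_address):
        """What the hardware MMU would compute using the shadow table."""
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        return self.shadow_pt[vpn] * PAGE_SIZE + offset
```

Every guest page-table update costs a trap plus the extra work in `guest_write_pte`, which gives an intuition for why workloads that modify page tables frequently suffer most under this scheme.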
On traditional mainframes, the classic type 1 hypervisor was self-standing and did not depend on any operating system or run any user applications itself. In contrast, the first x86 virtualization products were aimed at workstation computers, and ran a guest OS inside a host OS by embedding the hypervisor in a kernel module that ran under the host OS (type 2 hypervisor).[7]
There has been some controversy over whether the x86 architecture with no hardware assistance is virtualizable as described by Popek and Goldberg. VMware researchers pointed out in a 2006 ASPLOS paper that the above techniques made the x86 platform virtualizable in the sense of meeting the three criteria of Popek and Goldberg, albeit not by the classic trap-and-emulate technique.[3]:2-3 However, as of 2009 some academics still maintained that x86 without hardware assistance is not virtualizable.[9]
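Popek and Goldberg's sufficient condition for classic trap-and-emulate is that every sensitive instruction is also privileged, so that it traps when run in user mode. The check below is a minimal sketch of that set relation; the instruction lists are a small illustrative subset chosen for this example, not a complete classification of the x86 instruction set, and the function names are hypothetical.

```python
def is_classically_virtualizable(sensitive, privileged):
    """True if every sensitive instruction traps, i.e. is also privileged."""
    return set(sensitive) <= set(privileged)


def non_trapping_sensitive(sensitive, privileged):
    """Sensitive instructions that would run in user mode without trapping."""
    return sorted(set(sensitive) - set(privileged))


# Illustrative subset only.
X86_PRIVILEGED = {"LGDT", "MOV_TO_CR3", "HLT"}
X86_SENSITIVE = {"LGDT", "MOV_TO_CR3", "POPF"}
```

With these sets, `is_classically_virtualizable(X86_SENSITIVE, X86_PRIVILEGED)` is false and `non_trapping_sensitive` reports `POPF`, which is the kind of instruction that binary translation has to handle.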
Other systems, such as Denali, L4, and Xen, took a different route known as paravirtualization: operating systems are ported to run on a virtual machine that does not implement the parts of the actual x86 instruction set that are hard to virtualize. Paravirtualized I/O has significant performance benefits, as demonstrated in the original SOSP'03 Xen paper.[10]
64-bit
To protect the memory of the hypervisor (ring 0) from a guest OS running at ring 1, segmentation must be used.[11]:22 The initial version of x86-64 (AMD64) did not allow for a software-only full virtualization due to the lack of segmentation support in long mode, which made the protection of the hypervisor's memory impossible, in particular, the protection of the trap handler that runs in the guest kernel address space.[12][13]:11 and 20 Revision D and later 64-bit AMD processors (as a rule of thumb, those manufactured in 90 nm or less) added basic support for segmentation in long mode, making it possible to run 64-bit guests in 64-bit hosts via binary translation. Intel did not add segmentation support to its x86-64 implementation (Intel 64), making 64-bit software-only virtualization impossible on Intel CPUs, but Intel VT-x support makes 64-bit hardware assisted virtualization possible on the Intel platform.[14][15]:4
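The role of segmentation here can be shown with a toy limit check. This is a simplified sketch, assuming a flat expand-up segment with base 0 and an illustrative address for the hypervisor's reserved region; the function name and the `limit_checks_in_long_mode` flag are hypothetical stand-ins for the difference between the initial AMD64 design and later revisions.

```python
def segment_access_allowed(offset, limit, long_mode,
                           limit_checks_in_long_mode=False):
    """Toy model of a segment limit check for a flat segment with base 0."""
    if long_mode and not limit_checks_in_long_mode:
        # Initial x86-64 long mode: segment limits are not enforced.
        return True
    return offset <= limit


# Assumed layout: the hypervisor occupies the top of the address space,
# and the guest's segments are truncated to end just below it.
HYPERVISOR_BASE = 0xFFC00000
TRUNCATED_LIMIT = HYPERVISOR_BASE - 1
```

In 32-bit protected mode the truncated limit stops a guest access at `HYPERVISOR_BASE`; in the long-mode case without limit checks the same access is allowed, which is why software-only protection of the hypervisor breaks down there.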
On some platforms, it is possible to run a 64-bit guest on a 32-bit host OS if the underlying processor is 64-bit and supports the necessary virtualization extensions.[16]
Hardware assist
In 2005 and 2006, Intel and AMD (working independently) created new processor extensions to the x86 architecture. The first generation of x86 hardware support for virtualization addressed the issue of privileged instructions, with support for MMU virtualization added to the chipset later.
Processor
AMD virtualization (AMD-V)
AMD developed its first generation virtualization extensions under the code name "Pacifica", and initially published them as AMD Secure Virtual Machine (SVM),[17] but later marketed them under the trademark AMD Virtualization, abbreviated AMD-V.
On May 23, 2006, AMD released the Athlon 64 ("Orleans"), the Athlon 64 X2 ("Windsor") and the Athlon 64 FX ("Windsor") as the first AMD processors to support this technology.
AMD-V capability also features on the Athlon 64 and Athlon 64 X2 family of processors with revisions "F" or "G" on socket AM2, Turion 64 X2, and Opteron 2nd generation[18] and 3rd-generation,[19] Phenom and Phenom II processors. The APU Fusion processors support AMD-V. AMD-V is not supported by any Socket 939 processors. The only Sempron processors which support it are Huron and Sargas.
AMD Opteron CPUs beginning with the Family 0x10 Barcelona line, and Phenom II CPUs, support a second generation hardware virtualization technology called Rapid Virtualization Indexing (formerly known as Nested Page Tables during its development), later adopted by Intel as Extended Page Tables (EPT).
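With nested paging, the hardware performs two translations instead of relying on a software-maintained shadow table. The sketch below is a toy model under stated assumptions: page tables are flat dictionaries rather than multi-level trees, pages are 4 KiB, and the function names are hypothetical. The second function gives the commonly cited worst-case count of memory references for a two-dimensional walk with no TLB or walk caches.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def nested_translate(guest_virtual, guest_pt, nested_pt):
    """Translate guest-virtual -> guest-physical -> host-physical."""
    vpn, offset = divmod(guest_virtual, PAGE_SIZE)
    guest_pfn = guest_pt[vpn]        # table controlled by the guest OS
    host_pfn = nested_pt[guest_pfn]  # table controlled by the hypervisor
    return host_pfn * PAGE_SIZE + offset


def worst_case_walk_references(guest_levels, nested_levels):
    """Memory references for an uncached two-dimensional page walk.

    Each guest level needs a full nested walk to locate the guest table,
    plus one read of the guest entry; the final guest-physical address
    then needs one more nested walk.
    """
    return (guest_levels + 1) * (nested_levels + 1) - 1
```

For four-level guest and nested tables, `worst_case_walk_references(4, 4)` evaluates to 24, compared with 4 for a native walk. This is the trade-off of the approach: guest page-table updates no longer trap, but TLB misses become more expensive.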
(The CPU flag for AMD-V is "svm". This may be checked in Linux via /proc/cpuinfo.)
Intel virtualization (VT-x)
Previously codenamed "Vanderpool", VT-x represents Intel's technology for virtualization on the x86 platform.
On November 13, 2005, Intel released two models of Pentium 4 (Model 662 and 672) as the first Intel processors to support VT-x.
As of 2009, not all Intel processors supported VT-x, a feature Intel used to segment its market.[20] Support for VT-x may even vary between different versions (as identified by Intel's sSpec Number) of the same model number.[21][22] For a complete and up-to-date list, see the Intel website.[23] As late as May 2011, the Intel P6100, a CPU used in laptops, did not support hardware virtualization.[1]
With some motherboards, Intel's VT-x feature must be enabled in the BIOS before applications can make use of it.[24]
Intel has included Extended Page Tables (EPT),[25] a technology for page-table virtualization,[26] since the Nehalem architecture.[27][28]
(The CPU flag for VT-x is "vmx". This may be checked in Linux via /proc/cpuinfo.)
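The flag check described for both vendors can be scripted. This is a minimal sketch, assuming the x86 Linux layout of /proc/cpuinfo in which each processor entry has a "flags" line; the function names are hypothetical, and a missing flag may also mean that the feature is disabled in firmware rather than absent from the CPU.

```python
def virtualization_flags(cpuinfo_text):
    """Return the subset of {"svm", "vmx"} listed on any "flags" line."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found |= {"svm", "vmx"} & set(value.split())
    return found


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""
```

On a Linux host, `virtualization_flags(read_cpuinfo())` returns a set containing "vmx" on a VT-x system, "svm" on an AMD-V system, or an empty set if neither flag is reported.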
Software using AMD-V and/or Intel VT
Chipset
Memory and I/O virtualization is performed by the chipset.[29] Typically these features must be enabled by the BIOS, which must be able to support them and also be set to use them.
I/O MMU virtualization (AMD-Vi and VT-d)
An input/output memory management unit (IOMMU) enables guest virtual machines to directly use peripheral devices, such as Ethernet, accelerated graphics cards, and hard-drive controllers, through DMA and interrupt remapping. This is sometimes called PCI passthrough.[30] Both AMD and Intel have released specifications:
- AMD's I/O Virtualization Technology, "AMD-Vi", originally called "IOMMU".[31]
- Intel's "Virtualization Technology for Directed I/O" (VT-d),[32] which is included in most but not all Nehalem-based processors.[33]
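On Linux, an active IOMMU is visible through sysfs, where devices are placed into IOMMU groups that form the unit of isolation for passthrough. The following is a small sketch that lists those groups; the function name is hypothetical, and the default path assumes a kernel that exposes /sys/kernel/iommu_groups.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to a sorted list of its device names."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # no IOMMU support exposed (or not a Linux host)
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups
```

An empty result usually means that the IOMMU is absent or not enabled in the firmware or kernel. Devices that share a group generally have to be assigned to a guest together.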
Network virtualization (VT-c)
- Intel's "Virtualization Technology for Connectivity" (VT-c).[34]
PCI-SIG I/O Virtualization (IOV)
PCI-SIG I/O Virtualization (IOV) is a set of general (not x86-specific), PCI Express (PCI-E) based native hardware I/O virtualization methods standardized by PCI-SIG:[35]
- Address Translation Services (ATS): supports native IOV across PCI-E via address translation. It requires support for new transactions to configure such translations.
- Single Root IOV (SR-IOV): supports native IOV in existing single root complex PCI-E topologies. It requires support for new device capabilities to configure multiple virtualized configuration spaces.
- Multi-Root IOV (MR-IOV): supports native IOV in new topologies (e.g., blade servers) by building on SR-IOV to provide multiple root complexes which share a common PCI-E hierarchy.
In SR-IOV, the most common of these, a host VMM configures supported devices to create and allocate virtual "shadows" of their configuration spaces so that virtual machine guests can directly configure and access such "shadow" device resources.
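On Linux, an SR-IOV capable PCI device exposes the sysfs attributes `sriov_totalvfs` (the maximum number of virtual functions) and `sriov_numvfs` (the number currently enabled) in its device directory. The sketch below only reads them; the function name is hypothetical, and the device directory passed in (for example a path under /sys/bus/pci/devices/) is an assumption about the host.

```python
from pathlib import Path


def sriov_vf_counts(device_dir):
    """Return (enabled_vfs, total_vfs) for a PCI device, or None if not SR-IOV."""
    device = Path(device_dir)
    try:
        total = int((device / "sriov_totalvfs").read_text().strip())
        enabled = int((device / "sriov_numvfs").read_text().strip())
    except (OSError, ValueError):
        return None  # attributes missing or unreadable: no SR-IOV support exposed
    return enabled, total
```

A result such as `(0, 8)` would indicate a device that supports up to eight virtual functions with none currently created. Enabling them is done by the host, typically by writing to `sriov_numvfs` with sufficient privileges.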
See also
References
- "Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization". Intel.com. 2006-08-10. http://www.intel.com/technology/itj/2006/v10i3/1-hardware/3-software.htm. Retrieved 2010-05-02.
- "USENIX Technical Program - Abstract - Security Symposium - 2000". Usenix.org. 2002-01-29. http://www.usenix.org/events/sec2000/robin.html. Retrieved 2010-05-02.
- "A Comparison of Software and Hardware Techniques for x86 Virtualization" (PDF). VMware. http://www.vmware.com/pdf/asplos235_adams.pdf. Retrieved 2010-09-08.
- U.S. Patent 6,397,242
- U.S. Patent 6,704,925
- "Virtualization: architectural considerations and other evaluation criteria" (PDF). VMware. http://www.vmware.com/pdf/virtualization_considerations.pdf. Retrieved 2010-09-08.
- U.S. Patent 6,496,847
- "VMware and Hardware Assist Technology" (PDF). http://download3.vmware.com/vmworld/2006/tac9463.pdf. Retrieved 2010-09-08.
- "Implementation of a Purely Hardware-assisted VMM for x86 Architecture" (PDF). http://www.iaeng.org/publication/WCE2009/WCE2009_pp136-140.pdf. Retrieved 2010-09-08.
- "Xen and the Art of Virtualization" (PDF). http://www.cl.cam.ac.uk/research/srg/netos/papers/2003-xensosp.pdf.
- J. E. Smith, R. Uhlig (2005-08-14). Virtual Machines: Architectures, Implementations and Applications. HOTCHIPS 17, Tutorial 1, part 2.
- "How retiring segmentation in AMD64 long mode broke VMware". Pagetable.com. 2006-11-09. http://www.pagetable.com/?p=25. Retrieved 2010-05-02.
- "VMware and CPU Virtualization Technology" (PDF). VMware. http://download3.vmware.com/vmworld/2005/pac346.pdf. Retrieved 2010-09-08.
- "VMware KB: Hardware and firmware requirements for 64bit guest operating systems". Kb.vmware.com. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1003945. Retrieved 2010-05-02.
- "Software and Hardware Techniques for x86 Virtualization" (PDF). http://www.vmware.com/files/pdf/software_hardware_tech_x86_virt.pdf. Retrieved 2010-05-02.
- "VMware Server FAQs: What does 64-bit operating system support mean?". VMware Inc. Retrieved 2010-04-07.
- "33047_SecureVirtualMachineManual_3-0.book" (PDF). http://www.mimuw.edu.pl/~vincent/lecture6/sources/amd-pacifica-specification.pdf. Retrieved 2010-05-02.
- "What are the main differences between Second-Generation AMD Opteron processors and first-generation AMD Opteron processors?"
- "What virtualization enhancements do Third-Generation AMD Opteron processors feature?"
- Stokes, Jon (2009-05-08). "Microsoft, Intel goof up Windows 7's "XP Mode"". Arstechnica.com. http://arstechnica.com/microsoft/news/2009/05/r2e-microsoft-intel-goof-up-windows-7s-xp-mode.ars. Retrieved 2010-05-02.
- "Processor Spec Finder". Processorfinder.intel.com. http://processorfinder.intel.com/. Retrieved 2010-05-02.
- "Intel Processor Number Details". Intel. 2007-12-03. http://www.intel.com/products/processor_number/chart/index.htm. Retrieved 2008-10-03.
- "Intel Virtualization Technology List". Ark.intel.com. http://ark.intel.com/VTList.aspx. Retrieved 2010-05-02.
- "Windows Virtual PC: Configure BIOS". Microsoft. http://www.microsoft.com/windows/virtual-pc/support/configure-bios.aspx. Retrieved 2010-09-08.
- Neiger, Gil; A. Santoni, F. Leung, D. Rodgers, R. Uhlig. "Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization". Intel Technology Journal (Intel) 10 (3): 167–178. doi:10.1535/itj.1003.01. http://download.intel.com/technology/itj/2006/v10i3/v10-i3-art01.pdf. Retrieved 2008-07-06.
- Gillespie, Matt (2007-11-12). "Best Practices for Paravirtualization Enhancements from Intel Virtualization Technology: EPT and VT-d". Intel Software Network. Intel. http://software.intel.com/en-us/articles/best-practices-for-paravirtualization-enhancements-from-intel-virtualization-technology-ept-and-vt-d. Retrieved 2008-07-06.
- "First the Tick, Now the Tock: Next Generation Intel Microarchitecture (Nehalem)" (PDF) (Press release). Intel. http://www.intel.com/pressroom/archive/reference/whitepaper_Nehalem.pdf. Retrieved 2008-07-06.
- "Technology Brief: Intel Microarchitecture Nehalem Virtualization Technology" (PDF). Intel. 2009-03-25. http://download.intel.com/business/resources/briefs/xeon5500/xeon_5500_virtualization.pdf. Retrieved 2009-11-03.
- "Intel platform hardware support for I/O virtualization"
- "Linux virtualization and PCI passthrough". IBM. http://www.ibm.com/developerworks/linux/library/l-pci-passthrough/. Retrieved 2010-11-10.
- "AMD I/O Virtualization Technology (IOMMU) Specification Revision 1.26". http://support.amd.com/us/Processor_TechDocs/34434-IOMMU-Rev_1.26_2-11-09.pdf. Retrieved 2011-05-24.
- "Intel Virtualization Technology for Directed I/O (VT-d) Architecture Specification"
- "Intel Virtualization Technology for Directed I/O (VT-d) Supported CPU List"
- "Intel Virtualization Technology for Connectivity (VT-c)"
- "PCI-SIG I/O Virtualization (IOV) Specifications"